transfer learning
- North America > United States (0.14)
- Oceania > Australia > New South Wales (0.04)
- North America > Canada (0.04)
Pengi: An Audio Language Model for Audio Tasks
In the domain of audio processing, Transfer Learning has facilitated the rise of Self-Supervised Learning and Zero-Shot Learning techniques. These approaches have led to the development of versatile models capable of tackling a wide array of tasks while delivering state-of-the-art performance. However, current models inherently lack the capacity to produce the requisite language for open-ended tasks, such as Audio Captioning or Audio Question Answering. We introduce Pengi, a novel Audio Language Model that leverages Transfer Learning by framing all audio tasks as text-generation tasks. It takes an audio recording and text as input and generates free-form text as output.
Adapting to Change: A Comparison of Continual and Transfer Learning for Modeling Building Thermal Dynamics under Concept Drifts
Raisch, Fabian, Langtry, Max, Koch, Felix, Choudhary, Ruchi, Goebel, Christoph, Tischler, Benjamin
Transfer Learning (TL) is currently the most effective approach for modeling building thermal dynamics when only limited data are available. TL uses a pretrained model that is fine-tuned to a specific target building. However, it remains unclear how to proceed after initial fine-tuning, as more operational measurement data are collected over time. This challenge becomes even more complex when the dynamics of the building change, for example, after a retrofit or a change in occupancy. In Machine Learning literature, Continual Learning (CL) methods are used to update models of changing systems. TL approaches can also address this challenge by reusing the pretrained model at each update step and fine-tuning it with new measurement data. A comprehensive study on how to incorporate new measurement data over time to improve prediction accuracy and address the challenges of concept drifts (changes in dynamics) for building thermal dynamics is still missing. Therefore, this study compares several CL and TL strategies, as well as a model trained from scratch, for thermal dynamics modeling during building operation. The methods are evaluated using 5 to 7 years of simulated data representative of single-family houses in Central Europe, including scenarios with concept drifts from retrofits and changes in occupancy. We propose a CL strategy, Seasonal Memory Learning (SML), that provides greater accuracy improvements than existing CL and TL methods while maintaining low computational effort. SML outperformed the benchmark of initial fine-tuning by 28.1% without concept drifts and 34.9% with concept drifts.
- Europe > Central Europe (0.24)
- Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
- Europe > Slovakia > Bratislava > Bratislava (0.04)
- Construction & Engineering > HVAC (1.00)
- Information Technology (0.67)
- Energy > Renewable (0.67)
Adapting Tensor Kernel Machines to Enable Efficient Transfer Learning for Seizure Detection
de Rooij, Seline J. S., Hunyadi, Borbála
Transfer learning aims to optimize performance in a target task by learning from a related source problem. In this work, we propose an efficient transfer learning method using a tensor kernel machine. Our method takes inspiration from the adaptive SVM and hence transfers 'knowledge' from the source to the 'adapted' model via regularization. The main advantage of using tensor kernel machines is that they leverage low-rank tensor networks to learn a compact non-linear model in the primal domain. This allows for a more efficient adaptation without adding more parameters to the model. To demonstrate the effectiveness of our approach, we apply the adaptive tensor kernel machine (Adapt-TKM) to seizure detection on behind-the-ear EEG. By personalizing patient-independent models with a small amount of patient-specific data, the patient-adapted model (which utilizes the Adapt-TKM) outperforms both the patient-independent and the fully patient-specific models. Notably, it does so while requiring around 100 times fewer parameters than the adaptive SVM model, leading to a correspondingly faster inference speed. This makes the Adapt-TKM especially useful for resource-constrained wearable devices.
- Europe > Austria > Vienna (0.14)
- Europe > Netherlands > South Holland > Delft (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
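The idea of transferring knowledge "via regularization" in the abstract above can be illustrated with a much simpler model than a tensor kernel machine. The sketch below is not the Adapt-TKM; it assumes a plain linear least-squares model and penalizes the distance between the adapted weights and the source weights, in the spirit of the adaptive SVM. The function name fit_adapted_linear, the penalty weight lam, and the toy data are all hypothetical.

```python
def fit_adapted_linear(xs, ys, w_src, lam, steps=500):
    """Minimise (1/2n) * sum_i (w.x_i - y_i)^2 + (lam/2) * ||w - w_src||^2
    by gradient descent. A large lam keeps w close to the source weights."""
    n, d = len(xs), len(w_src)
    # Step size from an upper bound on the largest Hessian eigenvalue.
    trace = sum(x[j] ** 2 for x in xs for j in range(d)) / n
    step = 1.0 / (trace + lam)
    w = list(w_src)  # warm start at the source model
    for _ in range(steps):
        # Gradient of the penalty term pulls w back towards w_src.
        grad = [lam * (w[j] - w_src[j]) for j in range(d)]
        # Gradient of the data-fit term on the target samples.
        for x, y in zip(xs, ys):
            err = sum(wj * xj for wj, xj in zip(w, x)) - y
            for j in range(d):
                grad[j] += err * x[j] / n
        w = [w[j] - step * grad[j] for j in range(d)]
    return w


# Toy target data generated by the weights [2, -1] (assumed for illustration).
xs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]]
ys = [2.0, -1.0, 1.0, 3.0]
w_src = [1.5, -0.5]  # hypothetical source-model weights

w_free = fit_adapted_linear(xs, ys, w_src, lam=0.0)  # no pull towards source
w_tied = fit_adapted_linear(xs, ys, w_src, lam=1e6)  # strong pull towards source
```

With lam set to zero the fit follows the target data alone, and with a very large lam it stays at the source model; intermediate values trade off the two, which is the role the regularizer plays when only a little target data is available.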
ParaGate: Parasitic-Driven Domain Adaptation Transfer Learning for Netlist Performance Prediction
Sun, Bin, Zhou, Jingyi, Mu, Jianan, Chao, Zhiteng, Yang, Tianmeng, Xu, Ziyue, Ye, Jing, Li, Huawei
In traditional EDA flows, layout-level performance metrics are only obtainable after placement and routing, hindering global optimization at earlier stages. Although some neural-network-based solutions predict layout-level performance directly from netlists, they often face generalization challenges due to the black-box heuristics of commercial placement-and-routing tools, which create disparate data across designs. To this end, we propose ParaGate, a three-step cross-stage prediction framework that infers layout-level timing and power from netlists. First, we propose a two-phase transfer-learning approach to predict parasitic parameters, pre-training on mid-scale circuits and fine-tuning on larger ones to capture extreme conditions. Next, we rely on EDA tools for timing analysis, offloading the long-path numerical reasoning. Finally, ParaGate performs global calibration using subgraph features. Experiments show that ParaGate achieves strong generalization with minimal fine-tuning data: on openE906, its arrival-time R^2 improves from 0.119 to 0.897. These results demonstrate that ParaGate could provide guidance for global optimization in the synthesis and placement stages.
Hypothesis Transfer Learning via Transformation Functions
We consider the Hypothesis Transfer Learning (HTL) problem, where one incorporates a hypothesis trained on the source domain into the learning procedure of the target domain. Existing theoretical analyses either study only specific algorithms or present upper bounds only on the generalization error, not on the excess risk. In this paper, we propose a unified algorithm-dependent framework for HTL through a novel notion of transformation functions, which characterize the relation between the source and the target domains. We conduct a general risk analysis of this framework and, in particular, show for the first time that, if the two domains are related, HTL enjoys faster convergence rates of excess risk for Kernel Smoothing and Kernel Ridge Regression than the classical non-transfer learning settings. We accompany this framework with an analysis of cross-validation for HTL to search for the best transfer technique and to gracefully reduce to non-transfer learning when HTL is not helpful. Experiments on robotics and neural imaging data demonstrate the effectiveness of our framework.
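As a minimal sketch of what incorporating a source hypothesis can look like, the example below assumes one particular relation between the domains: the target function equals the source hypothesis plus a smooth offset. The offset is estimated by Gaussian kernel smoothing (Nadaraya-Watson) of the target residuals. The additive form, the function names, the bandwidth, and the toy data are assumptions for illustration, not the paper's framework or experiments.

```python
import math


def gaussian_kernel(u, v, bandwidth):
    """Gaussian weight between two scalar inputs."""
    return math.exp(-((u - v) ** 2) / (2.0 * bandwidth ** 2))


def kernel_smooth(xs, ys, query, bandwidth):
    """Nadaraya-Watson estimate: kernel-weighted average of ys at query."""
    weights = [gaussian_kernel(query, x, bandwidth) for x in xs]
    total = sum(weights)
    return sum(w * y for w, y in zip(weights, ys)) / total


def fit_offset_transfer(source_hypothesis, xs, ys, bandwidth):
    """Assume target(x) = source(x) + offset(x) and smooth the residuals."""
    residuals = [y - source_hypothesis(x) for x, y in zip(xs, ys)]

    def target_hypothesis(query):
        offset = kernel_smooth(xs, residuals, query, bandwidth)
        return source_hypothesis(query) + offset

    return target_hypothesis


# Hypothetical source hypothesis and a few target samples shifted by 0.5.
def source(x):
    return math.sin(3.0 * x)


target_xs = [0.0, 0.25, 0.5, 0.75, 1.0]
target_ys = [source(x) + 0.5 for x in target_xs]

predict = fit_offset_transfer(source, target_xs, target_ys, bandwidth=0.2)
prediction = predict(0.37)
```

The residuals are simpler than the target function itself (here a constant), so a handful of target samples is enough to recover them, which is the intuition behind faster rates when the two domains are related.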
Quality-Controlled Multimodal Emotion Recognition in Conversations with Identity-Based Transfer Learning and MAMBA Fusion
This paper addresses data quality issues in multimodal emotion recognition in conversation (MERC) through systematic quality control and multi-stage transfer learning. We implement a quality control pipeline for the MELD and IEMOCAP datasets that validates speaker identity, audio-text alignment, and face detection. We leverage transfer learning from speaker and face recognition, assuming that identity-discriminative embeddings capture not only stable acoustic and facial traits but also person-specific patterns of emotional expression. We employ RecoMadeEasy(R) engines for extracting 512-dimensional speaker and face embeddings, fine-tune MPNet-v2 for emotion-aware text representations, and adapt these features through emotion-specific MLPs trained on unimodal datasets. MAMBA-based trimodal fusion achieves 64.8% accuracy on MELD and 74.3% on IEMOCAP. These results show that combining identity-based audio and visual embeddings with emotion-tuned text representations on a quality-controlled subset of data yields consistently competitive performance for multimodal emotion recognition in conversation and provides a basis for further improvement on challenging, low-frequency emotion classes.
- North America > Canada > Ontario > Toronto (0.14)
- North America > United States (0.04)
- Asia > India > Telangana > Hyderabad (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.90)
- Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.68)
- Oceania > Australia (0.04)
- North America > United States > California (0.04)
- North America > Canada (0.04)